ABSTRACT

A multimedia search and indexing system automatically
selects scenes or events of interest from any media, e.g.,
video, film, or sound, for replay, in whole or in part, in other
contexts. The entire audio track of a recorded event in video,
film, sound, etc., is analyzed to determine audio levels
within a set of frequency ranges of interest. Audio clip levels
within the selected frequency ranges are chosen as audio
cues representative of events of interest in the track. The
selection criteria are applied to the audio track of the
recorded event. An Edit Decision List (EDL) is generated
from the analysis of the audio track. The list is representative
of scenes or sounds of interest as clips for reuse. The clips
are reviewed and accepted or rejected for reuse. Once
selected, the clips are edited using industry standard audio
and video editing techniques.

21 Claims, 7 Drawing Sheets 
MULTIMEDIA SEARCH AND INDEXING
FOR AUTOMATIC SELECTION OF SCENES
AND/OR SOUNDS RECORDED IN A MEDIA
FOR REPLAY BY SETTING AUDIO CLIP
LEVELS FOR FREQUENCY RANGES OF
INTEREST IN THE MEDIA

CROSS-REFERENCE TO RELATED
APPLICATION

This is a divisional application under 37 C.F.R. 1.53(b) of
copending U.S. patent application Ser. No. 09/107,389, filed
on Jun. 30, 1998, now U.S. Pat. No. 6,163,510.

BACKGROUND OF THE INVENTION

1. Field of Invention

This invention relates to information systems. More
particularly, the invention relates to multimedia search and
indexing systems for automatic event selection for replay
using audio cues and signal threshold levels.

2. Description of Prior Art

In managing intellectual property assets for maximum
return, it is common in the media industry to re-purpose
assets, particularly video and sound recording assets, in
whole or in part, into other products. An example of a
re-purposed asset would be a video recording
of a sporting event shown on television; a portion later
included in a commercial; and multiple clips used for news
or highlight recaps of the event as well as in a CD-ROM
game. Given the need to maximize asset return, the content
owner is faced with the problem of finding the desired
sections of video or audio materials within a given asset or
assets. This is the case whether the asset is stored in a
computer system or on traditional analog media such as
magnetic tape or film. The state of the art for identifying
events for re-purposing is automatic scene change detection.
This technology identifies the first frame of a scene that is
dramatically different than the preceding scene. However,
changes of scene may not be well correlated with the section
of media that is desired for re-purposing. For example, in a
fast moving game like hockey, the events, such as a goal
scored or goal missed, or a key player returning to the ice,
may not constitute a change of scene.

What is needed is a mechanism for automating the
selection of scenes of interest in an event in one context for
re-purposing in another context in which the selected events
correlate with the scenes and sounds and context of another
media product.

Prior art related to re-purposing intellectual property
includes the following:

U.S. Pat. No. 5,713,021 issued Jan. 18, 1998 and filed
Sep. 14, 1995, discloses a multimedia system which
facilitates searching for a portion of sequential data. The system
displays neighboring data depending on a requirement when
displaying the portion of the data. A view object
management means searches view objects stored in a view object
storage means depending on a specification of features of a
portion of that data. A display/reproduction means displays
and reproduces a portion of data corresponding to the view
searched by the view object means.

U.S. Pat. No. 5,613,032 issued Mar. 18, 1997, and filed
Sep. 2, 1994, discloses a system for recording and playing
back multimedia events and includes recording sources, a
preprocessor, a delivery processor, and user control units.
The system records and plays back multimedia events which
entails capturing tracks of various aspects of a multimedia
event; coding the tracks into digitized blocks; time stamping
each block; and compressing and pre-processing each track
as instructed in a source mapping table; transmitting tracks
of the multimedia event to the user as requested; and
adjusting the delivery track based upon relative time
information associated with the new position established after
search through a track of the multimedia event.

U.S. Pat. No. 5,621,658 issued Apr. 15, 1997, and filed
Jul. 13, 1993, discloses communicating an electronic action
from a data processing system via an audio device. At the
sending data processing system, an action is converted to a
pre-determined audio pattern. The action may be combined
with text converted into an audio message and contained in
an electronic mail object. The audio patterns are then
communicated to the audio device over telephone lines or
other communication means. At the receiving end, the audio
device records the object. A user can provide the recorded
object to a data processing system which then executes the
action and converts the text audio patterns back to text. In
addition, the action can be converted to text and displayed
on the data processing system.

None of the prior art discloses re-purposing intellectual
property, e.g., video and sound, in which certain events or
sounds in one context are automatically selected for use in or
with another context, where the selected events correlate
with the scenes and sounds in or for the other context.

SUMMARY OF THE INVENTION

An object of the invention is a system and method for
selecting scenes of interest in an event in one context for
incorporation in, or with another context, as a new or
modified product.

Another object is a system and method for automatically
selecting and correlating scenes of interest in one context,
for incorporation in or with another context, as a new or
modified product using audio cues for such selection and
correlation.

Another object is a system and method for automatically
selecting and correlating scenes of interest in one context
using audio cues and signal level thresholds for
incorporation of the selected scenes in other contexts as a new or
modified product.

Another object is a system and method for logically
combining different audio cues in selecting scenes of interest
in one context for use in different contexts.

Another object is a system and method for creating an
Edit Decision List identifying scenes of interest selected in
one context for use in another context using audio cues and
signal thresholds.

Another object is a system and method for establishing
"start" and "stop" times in an Edit Decision List for selection
of scenes of interest in one context to be used in different
contexts.

These and other objects, features and advantages, are
achieved in a multimedia search and indexing system which
automatically selects events or scenes of interest from any
media (video, films, sound) for replay in whole, or in part,
in other contexts, as a new or modified product. The entire
audio track of a recorded event in video, film, sound, etc., is
analyzed to determine audio levels or cues within a set of
frequency ranges of interest. The frequency ranges indicate
different sounds, e.g., a referee whistle; loud shouting or
clapping; a bell sound, etc., each sound having a distinctive
frequency and indicative of a scene of interest which
correlates with a highlight in an event when occurring at a
defined audio clip level. Alternatively, the sound level may
drop dramatically as indicative of a scene of interest.
Multiple frequency ranges may be analyzed for audio cues in
refining the identification of a scene of interest. An Edit
Decision List (EDL) of scenes of interest is generated from
the analysis of the audio track in which the frequency ranges
and audio levels match the criteria for a scene of interest.
The list includes "start" and "stop" times related to the time
codes in the track of the media for locating the scenes of
interest as a visual clip. The visual clips are reviewed and
accepted or rejected for re-purposing. Once selected, the
visual clips are edited using industry standard audio and
video editing techniques.
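
As an illustration only, the analysis summarized above (an audio
level recorded per analysis interval and per frequency range) might
be sketched as follows. The function names, the band dictionary, and
the use of a plain discrete Fourier transform in place of the
programmable digital filter are all assumptions made for this
sketch; they are not taken from the specification.

```python
import math

def band_level(samples, sample_rate, low_hz, high_hz):
    """Peak DFT magnitude inside [low_hz, high_hz], scaled so that a pure
    sine falling on an exact bin reads as its amplitude. (Hypothetical
    stand-in for the filter stage; not the patent's filter program.)"""
    n = len(samples)
    if n == 0:
        return 0.0
    first_bin = max(0, math.ceil(low_hz * n / sample_rate))
    last_bin = min(n // 2, math.floor(high_hz * n / sample_rate))
    peak = 0.0
    for k in range(first_bin, last_bin + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        peak = max(peak, 2 * math.hypot(re, im) / n)
    return peak

def analyze_track(samples, sample_rate, bands, granularity_s):
    """Return (time_code_s, band_name, level) for every whole analysis
    interval and every frequency range in `bands`."""
    step = int(sample_rate * granularity_s)
    records = []
    for start in range(0, len(samples) - step + 1, step):
        chunk = samples[start:start + step]
        for name, (low_hz, high_hz) in bands.items():
            level = band_level(chunk, sample_rate, low_hz, high_hz)
            records.append((start / sample_rate, name, level))
    return records
```

Each record pairs a time code with a frequency range and a measured
level, which is the information the later selection steps compare
against the audio clip level.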


DESCRIPTION OF DRAWING 

The foregoing objects, features and advantages will be
further understood from a detailed description of a preferred
embodiment taken in conjunction with the appended
drawing, in which:

FIG. 1A is a block diagram of an illustrative system for
multimedia searching and indexing using audio cues and
signal level thresholds and incorporating principles of the
present invention.

FIG. 1B is an alternative system for multimedia searching
and indexing using audio cues and signal level thresholds.

FIG. 2 is a representation of a visual tape and
accompanying sound track indicating events of interest for
re-purposing in another context as a new or modified
product.

FIG. 3 is a flow diagram of a selection process for scenes
of interest in the visual media of FIG. 2 using the system of
FIG. 1A or 1B.

FIG. 4 is a flow diagram of an audio analysis conducted
in the process of FIG. 3.

FIG. 5 is a flow diagram for setting audio parameters for
selection of scenes of interest in the process of FIG. 3.

FIG. 6 is a flow diagram for creating an Edit Decision List
(EDL) in the process of FIG. 3.

FIG. 7 is a reproduction of an Edit Decision List (EDL).

DESCRIPTION OF PREFERRED EMBODIMENT 

In FIG. 1A, a system 10 is shown for automatically
identifying and selecting scenes or sounds of interest in a
media using audio cues and signal level thresholds for
re-purposing the media. The system includes a means of
listening to or viewing source material on a tape transporter
11, such as a conventional tape drive or other equipment in
which a visual or sound media 12, e.g., film, video disk, or
compact disk, is loaded and moved back and forth according
to an editor's needs in selecting scenes or sounds of interest
for re-purposing. An analog signal on the tape is transferred
to an analog/digital converter 13 for conversion into a digital
counterpart by well-known methods, e.g., pulse amplitude
modulation. A digital signal on the tape or the converted
analog signal is provided to a programmable digital filter 14.
The programmable digital filter 14 is responsive to the
digital signal in conjunction with a digital filter program 15
stored in a random access memory 16. The digital filter
program 15 in conjunction with the filter 14 selects
frequency ranges in the analog signal of interest to an editor.
The memory 16 is coupled through a system bus B to a
system processor 18, a display 19, and a storage disk 20. The
memory also includes a standard operating system; an
analysis program 21 for identifying scenes of interest in the
media 12; a parameter setting program 22 for automatically
setting audio levels or cues representative of scenes of
interest in the media 12; and an edit decision list program 23
which provides "start" and "stop" time codes in the media
for scenes of interest as a basis for an editor to select the
scene, display it on the display 19, and incorporate the scene
into a modified or new product using conventional editing
processes. The analysis program 21, parameter setting
program 22, and edit decision list program 23 will be described
hereinafter in implementing the method of the invention in
the system 10.

In FIG. 1B, an alternative system for multimedia searching
and indexing using the analysis program 21, parameter
setting program 22 and edit decision list program 23
includes a standard video tape recorder 11' and a standard
oscilloscope 14' as substitutes for the transporter 11, A/D
converter 13 and programmable filter 14 in providing the
audio signal from the media 12 to the system bus B for
processing in the manner to be described hereinafter for both
FIGS. 1A and 1B.

As an illustrative example of re-purposing, FIG. 2 shows 
an event of interest, for example a football game, as recorded 
on a videotape 20 and containing a video clip 21 having 
scenes of interest for re-purposing in another context. In one 
embodiment, the clip 21 contains scenes of a touchdown 22 
and an interception 24. The tape 20 includes a soundtrack 26 
which records the sound levels accompanying the scenes. 
The taped scenes and soundtrack are accompanied by time 
codes 28 included in the tape. The time codes are industry 
standard time codes used to navigate the tape. The sound 
signal levels are selected for a clip level or threshold 29 
based on past experience. Signal levels exceeding the 
threshold are used to identify a scene for re-purposing as will 
be described in conjunction with FIGS. 3-6. 

In another embodiment, sound levels equal to or less than 
a threshold may be indicative of a scene or sound of interest. 
For example, when a factory shuts down and the power
equipment stops running, a dramatic drop in sound would 
occur indicative of a scene or sound of interest. However, for 
purposes of description of the invention, the cases of sounds 
exceeding a threshold will be described. 
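
The two embodiments above differ only in the direction of the
comparison against the clip level. A minimal sketch, assuming the
levels have already been measured per interval, is given below; the
function name `flag_intervals` and the `mode` argument are
hypothetical and are not part of the specification.

```python
def flag_intervals(levels, clip_level, mode="above"):
    """Return the time codes whose level satisfies the chosen criterion.

    levels: list of (time_code_s, level) pairs, one per analysis interval.
    mode "above": level exceeds the clip level (the case described here).
    mode "below": level is equal to or less than the clip level
                  (the alternative embodiment, e.g. a sudden drop in sound).
    """
    if mode == "above":
        return [tc for tc, level in levels if level > clip_level]
    if mode == "below":
        return [tc for tc, level in levels if level <= clip_level]
    raise ValueError("mode must be 'above' or 'below'")
```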

In FIG. 3, the entire audio track under investigation is first
analyzed to determine the audio levels within a set of
frequency ranges of interest in a step 30. An editor selects
desired frequency ranges and analysis granularity. Analysis
granularity refers to the length of intervals to be examined.
For example, a granularity of one second means that each
second of media will be analyzed separately. For some
applications, the granularity of an analysis may be preset.
Frequency ranges may be set to recognize things such as
applause, the roar of crowds, the blowing of a whistle, etc.
Certain of these ranges are representative of highlights in the
event recorded in the tape. For each frequency, each time
interval is analyzed and the audio level and time code
recorded. When all frequencies have been analyzed for each
time interval, the analysis is complete.

In a step 50, selection criteria are chosen, such as audio
clip levels within frequency ranges. The parameters are
selected for scenes of interest which correlate to the
highlight(s) in an event. For each desired frequency range,
several parameters are recorded. The audio level at which
scenes are to be selected is chosen. Two time parameters,
"P" and "F", are also chosen. "P" represents the number of
seconds preceding the attainment of a threshold level which
are to be included in a candidate clip for re-purposing. "F"
represents the number of seconds following the attainment
of the clip level which are to be included in the candidate
clip. The candidate creation parameters are basic for the
selection of the scenes of interest. Other selection criteria,
such as total time desired for the aggregation of all candidate
clips and more complex relations between the frequencies
may also be chosen. Aggregation criteria may also be used,
e.g., Exclusive OR, AND, and/or relations between the
attainment of audio clip levels within different frequency
ranges.

In a step 70, the selection criteria in step 50 are applied to
the results of the analysis done in step 30 and result in a
candidate Edit Decision List (EDL). In step 70, for each
analysis interval and frequency range desired, the recorded
audio level is compared with the parameters obtained from
the step 50. The comparison generates candidate time codes
for inclusion in the EDL. The list of time codes is then
decomposed into a set of intervals representing the candidate
clips. As shown in FIG. 7, each clip is represented by a
"start" and "end" time code.

In a step 90, an editor can use the "start" and "end" time
codes to navigate into an appropriate portion of the media
and examine the candidate clip including the audio. The
editor may choose to modify the parameters and generate
alternate lists of candidate clips depending on the
acceptability of the selection.

Other audio cues may be used to further refine the
selection of the EDL. For example, if action is desired, the
video may be analyzed for motion, and this analysis
cross-referenced with the audio analysis. Another example would
cross-reference fixed text word recognition with the
analysis. In this case, recognition of words such as "touchdown"
and "interception" within a given time range could be used
to validate the appropriateness of candidate video clips. In
such case, the EDL can reflect which key words have been
observed with which clip.

Now turning to FIG. 4, the audio analysis of step 30 will
be described in more detail.

In FIG. 4, an audio analysis is started in a step 41 in which
an editor selects desired frequency ranges (F) to identify
scenes of interest in the soundtrack, such as applause, the
roar of the crowd, blowing of a whistle, etc. Typically, these
ranges are of the order of ten times the amplitude greater
than the steady-state sound level. The duration of the sound
of interest can range from less than one second in the
case of a bullet shot to 10's of seconds in the case of the roar
of the crowd responding to a sporting event.

In a step 42, an editor selects an analysis granularity or
time-length of intervals in seconds (S) for examination. For
example, a granularity of 1 second means that each second
of media will be analyzed separately. With some
applications, the granularity of analysis may be preset.

In step 43, the editor determines the time length (G) of the
event on the tape to be analyzed.

In step 44, the editor calculates the number of analysis
intervals by the relation G/S. For each interval, the
corresponding time code and audio level are recorded for each
frequency.

In step 45, the tape is moved to the time code for the first
analysis interval.

In step 46, the soundtrack is filtered for desired frequency
ranges using the system of FIG. 1A or 1B. For each frequency
range the audio level is measured in a step 47.

The interval, frequency range, audio level and time code
are recorded for subsequent use in step 48. The tape is
moved to the time code for the next interval in a step 49 and
the process is repeated until a test 50 indicates the last
interval has been analyzed, at which time the analysis ends.

The process of setting parameters for the selection of
scenes of interest by audio cues is described in more detail
in FIG. 5. The process is started in a step 51 in which the
editor selects a first frequency range for setting parameters
to identify scenes of interest.

In step 52, the editor selects the audio clip level (A) at
which scenes are to be selected for the first frequency range.

In step 53, the editor selects a time interval (P) in seconds
leading the audio threshold event for the frequency range
being investigated.

In step 54, the editor selects a time interval (F) in
seconds following the audio threshold event for the selected
frequency range.

In step 55, the next frequency range is selected. In a test
56, the process returns to step 52 if the last frequency range
has not had parameters assigned. The process for setting
parameters for the selection of scenes of interest ends when
the last frequency range has been classified.

The process of creating candidate scenes for the EDL is
further described in FIG. 6 in which a comparison is made
of the recorded audio level with the parameters set in FIG.
5 to generate candidate time codes for inclusion in the EDL
for each analysis interval and desired frequency range.

The process for creating the EDL is started in a step 71 in
which the media is set for the first interval.

In step 72, the first frequency range of the first interval is
provided to a comparator in a step 73 in which the recorded
audio level is compared with the target audio clip level.

A test 74 is performed to determine whether the audio clip
level has been reached. A "no" condition moves the program
to entry point A which will be described hereinafter. A "yes"
condition indicates that this interval contains an audio level
in a frequency range which has exceeded the audio clip level
or signal threshold and represents a scene of interest. The
associated time code (TC) in the interval containing the
scene of interest is recorded in the EDL in a step 75.

In step 76, the parameter P is subtracted from the time
code of the interval and a test 77 is performed to determine
if the time code minus P is less than the time code for the
start of the media. A "yes" condition initiates a step 78 to
replace the time code minus the parameter P for the analyzed
interval with the time code for the start of the media, after
which the program moves to step 79. Similarly, a "no"
condition moves the program to step 79 in which the interval
from time (TC-P) to the time code (TC) is entered in the
EDL for the first analysis, after which a step 80 adds the F
interval to the time code recorded in the EDL for the
frequency range analyzed in the first interval.

A test 81 is performed to determine if the time code for the
event recorded in the EDL plus the F parameter exceeds the
time code for the end of the media. A "yes" condition initiates
a step 82 to replace the time code of the recorded event plus
the F parameter with the time code for the end of the media,
after which the program moves to a step 83. Similarly, a "no"
condition moves the program to the step 83 in which the
interval time code plus the F parameter is recorded in the EDL
as a stop code for the event of interest.
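
The start and stop computation of steps 76 through 83 amounts to
offsetting the time code by P and F and clamping the result to the
bounds of the media. A minimal sketch follows, assuming time codes
are expressed in seconds; the function and argument names are
illustrative only.

```python
def clip_bounds(tc, lead_p, follow_f, media_start, media_end):
    """Return (start, stop) for a scene detected at time code `tc`.

    start = TC - P, replaced by the media start if it would fall before it
            (tests 77 and step 78).
    stop  = TC + F, replaced by the media end if it would fall after it
            (test 81 and the following replacement step).
    """
    start = max(tc - lead_p, media_start)
    stop = min(tc + follow_f, media_end)
    return start, stop
```

For example, with P = 5 and F = 10 on a one-hour tape, an event at
second 3 yields the clip (0, 13) because the lead-in is clamped to
the start of the media.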

In step 84 the program is set for the next frequency in the
interval. Step 84 is also the entry point for node A in which
frequencies which do not exceed the audio clip level are
returned for analysis of the subsequent frequency range. A
test 85 determines if the last frequency range has been
completed for the interval. A "no" condition moves the
program to entry point B which enters step 73 to compare
the audio levels in the subsequent frequency range and
determine "start" and "stop" time codes for scenes of
interest as suggested by the subsequent frequency range.

Those intervals exceeding the audio clip levels are
recorded in the EDL along with "start" and "stop" codes as
described in conjunction with steps 77-84.

A "yes" condition for test 85 initiates a step 86 in which 
the tape is moved to the next interval for frequency analysis. 

A test 87 determines whether or not the last interval has
been analyzed. A "no" condition moves the program to entry
point C which enters step 72 to set the first frequency range
in the next interval, after which the process is continued for
identifying scenes of interest in each frequency range and
recording the selected scenes in the EDL with their "start"
and "stop" codes per steps 77-83. The process is repeated
until the last interval and the last frequency range thereof
have been examined for scenes of interest. The scenes are
recorded in the EDL for "start" and "stop" codes when
appropriate. When the last interval has been analyzed, the
test 87 indicates a "yes" condition which initiates a step 88
in which the editor determines the contiguous intervals
which will be used in the re-purposing of the selected
scenes. A step 89 formats the time intervals for use in manual
review of the scenes by the editor after which the process
ends.
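
In step 88 the editor determines the contiguous intervals. As an
illustration only, that grouping could be assisted by merging
candidate clips whose "start" and "stop" codes overlap or touch. The
automated merge below is an assumption of this sketch, not a step
recited in the specification, and the function name is hypothetical.

```python
def merge_clips(clips):
    """Merge (start, stop) pairs that overlap or abut into single clips.

    The input may be in any order; the result is sorted by start time.
    """
    merged = []
    for start, stop in sorted(clips):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous clip: extend its stop code.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged
```

Merging in this way keeps a long crowd roar, detected in several
consecutive intervals, from appearing as several near-duplicate
entries in the list presented for review.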

FIG. 7 shows the EDL for the scenes of interest. Each
scene is entered in the EDL with a highlight number, "start"
time, and "end" time, which the editor can use to navigate
the appropriate portion of the media and view the candidate
clip. The editor may choose to modify the parameters and
generate alternate lists of candidate clips depending on the
acceptability of the suggestions. If the clips are accepted,
they may be edited using industry standard audio and video
editing techniques for their incorporation in new or modified
products, which maximizes the investment in the intellectual
property assets represented by the video clips.
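
A list with a highlight number, "start" time, and "end" time per
scene could be rendered along the following lines. The exact layout
of FIG. 7 is not reproduced in this text, so the column format, the
HH:MM:SS:FF time code notation, and the 30 frames-per-second
non-drop rate used here are all assumptions of this sketch.

```python
def to_timecode(seconds, fps=30):
    """Convert seconds to an HH:MM:SS:FF string (non-drop-frame, assumed)."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"

def format_edl(clips, fps=30):
    """Return one text line per clip: highlight number, start, end."""
    lines = []
    for number, (start, end) in enumerate(clips, start=1):
        lines.append(f"{number:03d}  {to_timecode(start, fps)}  {to_timecode(end, fps)}")
    return lines
```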

In summary, the present invention provides a system and 
method for automatically selecting scenes of interest as 
visual clips in a media, e.g., herein video, film, sound, etc., 
using audio cues and signal thresholds. The selected clips 
may be re-purposed in new, improved or modified products, 
thereby maximizing the investment return on the intellectual 
property asset represented by the clips. A method of select- 
ing the scenes involves analyzing the audio track associated 
with the visual portion of the media for audio levels exceed- 
ing thresholds identified for the different frequencies and 
intervals of the media. These audio cues are used to identify 
visual clips incorporating scenes of interest. Each clip is 
associated with a "start" and "stop" code in which the audio 
cue has been detected as exceeding a threshold. The selected 
scenes are recorded in an Edit Decision List (EDL) which 
enables an editor to review the visual clips and re-purpose 
the clips into new or modified products. 

While the invention has been described in conjunction 
with a specific embodiment, modifications can be made 
therein without departing from the spirit and scope of the 
invention as defined in the appended claims, in which: 

We claim: 

1. In a signal processing system including a multimedia
search and indexing system for automatic selection of scenes
or sounds recorded in a media for replay in other contexts,
a method for setting audio clip levels in analyzing the media
for a set of frequency ranges of interest for replay,
comprising the steps of:

(a) selecting an audio clip level for each frequency range 
as indicative of a scene or sound of interest in the 
media; 

(b) selecting a time interval in seconds leading an audio 
level exceeding the clip level; 

(c) selecting a time interval in seconds following the 
exceeded audio clip level; 

(d) repeating steps (a), (b), and (c) for each frequency 
range; and 

(e) recording and relating each scene of interest exceeding 
the audio clip level to the index in the media. 

2. The method of claim 1 further comprising the step of: 

(f) comparing the recorded audio clip level with target clip 
level. 

3. The method of claim 1 further comprising the step of: 

(g) determining if the audio clip level was reached. 

4. The method of claim 1 further comprising the step of: 

(h) recording an associated time code when the audio clip 
level is reached. 

5. The method of claim 1 further comprising the step of:

(j) cross-referencing fixed text word recognition in the
frequency range with the audio clip level.

6. The method of claim 1 further comprising the step of:

(k) assigning a time code to the scene of interest
exceeding the audio clip level.

7. The method of claim 1 further wherein the selection of 
the desired frequency ranges is the selection of one fre- 
quency range. 




8. The method of claim 7 wherein the one frequency range 
is a human frequency range. 

9. The method of claim 7 wherein the one frequency range 
is an entire audio spectrum. 

10. The method of claim 7 wherein the one frequency
range is the system capacity.

11. A program medium executable on a computer system
for automatic selection of scenes or sounds recorded in a
media for replay in other contexts, comprising:

(a) program code in the medium selecting an audio clip
level for each frequency range as indicative of a scene
or sound of interest in a media;

(b) program code selecting a time interval leading an
audio level exceeding the clip level;

(c) program code selecting a time interval following the
exceeded audio clip level;

(d) program code in a medium for repeating steps (a), (b)
and (c) for each frequency range;

(e) program code in a medium for recording and relating
each scene of interest exceeding the audio clip level to
an index in the media.

12. A system for automatic selection of scenes or sounds
recorded in the media for replay in other contexts,
comprising:

(a) means selecting an audio clip level for each frequency 
range as indicative of a scene or sound of interest in the 
media; 

(b) means selecting a time interval leading an audio level 
exceeding the clip level; 
(c) means selecting a time interval following the exceeded
audio clip level; and 

(d) means recording and relating each scene of interest 
exceeding the audio clip level to an index in the media. 

13. The system of claim 12 further comprising: 

(e) means for comparing the recorded audio clip level 
with target clip level. 

14. The system of claim 12 further comprising: 

(f) means for determining if the audio clip level was 
reached. 

15. The system of claim 12 further comprising: 

(h) means for recording an associated time code when the 
audio clip level is reached. 

16. The system of claim 12 further comprising: 

(j) means for cross-referencing fixed text word recogni- 
tion in the frequency range with the audio clip level. 

17. The system of claim 12 further comprising: 

(k) means for assigning a time code to the scene of interest 
exceeding the audio clip level. 

18. The system of claim 12 further wherein the selection 
of the desired frequency ranges is the selection of one 
frequency range. 

19. The system of claim 18 wherein the one frequency 
range is a human frequency range. 

20. The system of claim 18 wherein the one frequency 
range is an entire audio spectrum. 

21. The system of claim 18 wherein the one frequency 
range is the system capacity. 
